AntispamLab - A Tool for Realistic Evaluation of Email Spam Filters

نویسندگان

  • Slavisa Sarafijanovic
  • Luis Hernández
  • Raphael Naefen
  • Jean-Yves Le Boudec
چکیده

The existing tools for testing spam filters evaluate a filter instance by simply feeding it with a stream of emails, possibly also providing a feedback to the filter about the correctness of the detection. In such a scenario the evaluated filter is disconnected from the network of email servers, filters, and users, which makes the approach inappropriate for testing many of the filters that exploit some of the information about spam bulkiness, users’ actions and social relations among the users. Corresponding evaluation results might be wrong, because the information that is normally used by the filter is missing, incomplete or inappropriate. In this paper we present a tool to test spam filters in a very realistic scenario. Our tool consists of a set of Python scripts for Unix/Linux. The tool takes as inputs the filter to be tested and an affordable set of interconnected machines (e.g., PlanetLab machines, or locally created virtual machines). When started from a central place, the tool uses the provided machines to build a network of real email servers, installs instances of the filter, deploys and runs simulated email users and spammers, and computes the detection results statistic. Email servers are implemented using Postfix, a standard Linux email server. Only per-email-server filters are currently supported; testing per-email-client filters would require additional development of the tool. The size of the created emailing network is constrained only by the number of available PlanetLab or virtual machines. The run time is much shorter then the simulated system time, due to a time scaling mechanism. Testing a new filter is as simple as installing one copy of it in a real emailing network, which unifies the jobs of a new filter development, testing and prototyping. As an example of how to use the tool, we test the SpamAssassin filter.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection

Emails are one of the fastest economic communications. Increasing email users has caused the increase of spam in recent years. As we know, spam not only damages user’s profits, time-consuming and bandwidth, but also has become as a risk to efficiency, reliability, and security of a network. Spam developers are always trying to find ways to escape the existing filters therefore new filters to de...

متن کامل

A Mail Client Plugin for Privacy-Preserving Spam Filter Evaluation

We describe a plugin extension to the Thunderbird Mail Client to support standardized evaluation of multiple spam filters on private mail streams. Researchers need not view or handle the subject users’ messages and subject users need not be familiar with spam filter evaluation methodology. All that is required of the user is to install the plugin as a standard extension and to run it on his or ...

متن کامل

Personalised, Collaborative Spam Filtering

The state of the art sees content-based filters tending towards collaborative filters, whereby email is filtered at the MTA with users feeding information back about false positives and negatives. While this improves the ability of the filter to track concept drift in spam over time, such approaches make assumptions implicit in centralised spam filtering, such as that all users consider the sam...

متن کامل

Email Spam Detection: a Symbiotic Feature Selection Approach Fostered by Evolutionary Computation

The electronic mail (email) is nowadays an essential communication service being widely used by most Internet users. One of the main problems affecting this service is the proliferation of unsolicited messages (usually denoted by spam) which, despite the efforts made by the research community, still remains as an inherent problem affecting this Internet service. In this perspective, this work p...

متن کامل

Filtering Email Spam in the Presence of Noisy User Feedback

Recent email spam filtering evaluations, such as those conducted at TREC, have shown that near-perfect filtering results are attained with a variety of machine learning methods when filters are given perfectly accurate labeling feedback for training. Yet in realworld settings, labeling feedback may be far from perfect. Real users give feedback that is often mistaken, inconsistent, or even malic...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007